Automatic trend detection: Time-biased document clustering

Abstract

Identifying the trending topics in journals and conferences is valuable for understanding the role of authors, institutions, and funding agencies in the progression of knowledge produced in a field. However, many available clustering methods do not accommodate a desire for temporally clustered results that are typical of trends, in part because time of publication is often neglected as a feature. As a demonstration of how time can be emphasized in trend detection, we use a novel approach of introducing a weighted temporal feature to bias topic clustering toward articles in a similar time frame; this is performed over a set of finance journal abstracts from 1974 to 2020. Latent Dirichlet Allocation (LDA) is used to parameterize each abstract, followed by dimensionality reduction using Singular Value Decomposition (SVD). We detect trends that are not identifiable when using standard clustering with no time bias. To identify trending topics, we utilize a metric of silhouette score divided by the standard deviation of clusters in time. We then isolate the identified trending topics and validate them with expert judgment. Our strategy can be readily utilized in other fields for discovering the rise and fall of trends.
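The time-bias idea and the trend metric described in the abstract can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the two-dimensional vectors stand in for LDA topic weights after SVD, k-means is used because the abstract does not name the clustering algorithm, the year is standardized before weighting, the silhouette is computed per cluster on the augmented vectors, and all function names (`add_time_feature`, `kmeans`, `cluster_silhouette`, `trend_score`) are hypothetical.

```python
import math
import statistics


def add_time_feature(vectors, years, weight):
    """Append a standardized publication year, scaled by `weight`, to each vector."""
    mean = statistics.fmean(years)
    std = statistics.pstdev(years) or 1.0  # guard against identical years
    return [list(v) + [weight * (y - mean) / std] for v, y in zip(vectors, years)]


def kmeans(points, k, iters=50):
    """Lloyd's k-means with a deterministic start (evenly spaced seed points)."""
    step = len(points) // k
    centers = [list(points[i * step]) for i in range(k)]
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [statistics.fmean(col) for col in zip(*members)]
    return labels


def cluster_silhouette(points, labels, cluster):
    """Mean silhouette coefficient over the points assigned to one cluster."""
    scores = []
    for i, p in enumerate(points):
        if labels[i] != cluster:
            continue
        dists = {}  # cluster label -> distances from p to that cluster's points
        for j, q in enumerate(points):
            if j != i:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        others = [statistics.fmean(d) for lab, d in dists.items() if lab != cluster]
        if cluster not in dists or not others:
            scores.append(0.0)  # singleton cluster or only one cluster overall
            continue
        a, b = statistics.fmean(dists[cluster]), min(others)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return statistics.fmean(scores)


def trend_score(points, labels, years, cluster, eps=1e-9):
    """Silhouette of a cluster divided by the spread of its publication years."""
    spread = statistics.pstdev([y for y, lab in zip(years, labels) if lab == cluster])
    return cluster_silhouette(points, labels, cluster) / max(spread, eps)


# Hypothetical data: three "topic A" and three "topic B" documents.
vectors = [(1.0, 0.0), (0.9, 0.1), (1.0, 0.1), (0.0, 1.0), (0.1, 0.9), (0.0, 0.9)]
years = [1980, 2000, 2019, 2018, 2019, 2020]

unbiased = add_time_feature(vectors, years, weight=0.0)
biased = add_time_feature(vectors, years, weight=1.5)

print(kmeans(unbiased, 2))  # [0, 0, 0, 1, 1, 1]: grouped by topic only
print(kmeans(biased, 2))    # [0, 0, 1, 1, 1, 1]: the 2019 document joins the recent group

labels = kmeans(unbiased, 2)
for c in (0, 1):
    print(c, round(trend_score(unbiased, labels, years, c), 3))
```

In this toy data the second cluster is both compact and concentrated in 2018 to 2020, so its score is much higher than that of the first cluster, whose documents span four decades. The weight controls the trade-off: at 0 the clustering ignores time, and at large values publication year can override topic similarity, so in practice it would need tuning.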

Similar resources

Trend-based Document Clustering for Sensitive and Stable Topic Detection

The ability to detect new topics and track them is important given the huge amounts of documents. This paper introduces a trend-based document clustering algorithm for analyzing them. Its key characteristic is that it gives scores to words on the basis of the fluctuation in word frequency. The algorithm generates clusters in a practical time, with O(n) processing cost due to preliminary calcula...

Automatic Document Clustering Using Topic Analysis

Web users are demanding more out of current search engines. This can be noticed by the behaviour of users when interacting with search engines [12, 28]. Besides traditional query/results interactions, other tools are springing up on the web. An example of such tools includes web document clustering systems. The idea is for the user to interact with the system by navigating through an organised ...

Stock Data Clustering and Multiscale Trend Detection

Generally, trend detection algorithms over the data stream require expert assistance in some form. We present an unsupervised multiscale data stream algorithm which detects trends for evolving time series based on a data driver data stream. The raw stream data clustering algorithm is incremental, space dilating and has linear time complexity. The evolving stream is incrementally explored on a n...

Automatic Table Detection in Document Images

In this paper, we propose a novel technique for automatic table detection in document images. Lines and tables are among the most frequent graphic, non-textual entities in documents and their detection is directly related to the OCR performance as well as to the document layout description. We propose a workflow for table detection that comprises three distinct steps: (i) image pre-processing; ...

Automatic Borders Detection of Camera Document Images

When capturing a document using a digital camera, the resulting document image is often framed by a noisy black border or includes noisy text regions from neighbouring pages. In this paper, we present a novel technique for enhancing the document images captured by a digital camera by automatically detecting the document borders and cutting out noisy black borders as well as noisy text regions a...

Journal

Journal title: Knowledge Based Systems

Year: 2021

ISSN: 1872-7409, 0950-7051

DOI: https://doi.org/10.1016/j.knosys.2021.106907